Mon. Not. R. Astron. Soc. 000, 000-000 (0000) Printed 1 February 2008 (MN LaTeX style file v1.4)

The Durham/UKST Galaxy Redshift Survey - III. 

Large Scale Structure via the 2-Point Correlation Function. 

A. Ratcliffe 1 , T. Shanks 1 , Q.A. Parker 2 and R. Fong 1 

1 Physics Department, University of Durham, South Road, Durham, DH1 3LE.
2 Anglo-Australian Observatory, Coonabarabran, NSW 2357, Australia.



ABSTRACT

We have investigated the statistical clustering properties of galaxies by calculating the
2-point galaxy correlation function from the Durham/UKST Galaxy Redshift Survey.
This survey is magnitude limited to $b_J \sim 17$, contains ~2500 galaxies sampled at a rate
of one in three and surveys a volume of space of ~$4 \times 10^6$ ($h^{-1}$ Mpc)$^3$. We have
empirically determined the optimal method of estimating the 2-point correlation function
from just such a magnitude limited survey. Applying our methods to this survey, we find
that our redshift space results agree well with those from previous optical surveys. In
particular, we confirm the previously claimed detections of large scale power out to
~40 $h^{-1}$ Mpc scales. We compare with two common models of cosmological structure
formation and find that our 2-point correlation function has power significantly in excess
of the standard cold dark matter model in the 10-30 $h^{-1}$ Mpc region. We therefore support
the observational results of the APM galaxy survey. Given that only the redshift space
clustering can be measured directly, we use standard modelling methods to estimate the
real space 2-point correlation function indirectly. This real space 2-point correlation
function has a lower amplitude than the redshift space one but a steeper slope.

Key words: galaxies: clusters - galaxies: general - cosmology: observations -
large-scale structure of Universe.

1 INTRODUCTION

Historically, the spatial 2-point correlation function, $\xi$, has
played a central role in the quantitative measurement of
the strength of galaxy clustering. It provides fundamental
information about the galaxy distribution in the sense that
it is the Fourier transform partner of the power spectrum of
the density fluctuations. This statistic is also straightforward
(if laborious) to compute and easy to understand,
with a direct probabilistic interpretation (e.g. Peebles 1980).

The usual methods of estimating the spatial 2-point correlation
function are either from the deprojection of the angular
correlation function, $w(\theta)$, (Limber 1954) or by direct
estimation from the observed galaxy distribution in redshift
surveys (e.g. Davis & Peebles 1983). Both methods have
problems; the deprojection techniques are generally unstable
and require additional galaxy number-distance information,
while redshift surveys (by construction) have their galaxy
distance estimates distorted by the galaxy peculiar velocity
field. Therefore, they measure the real space correlation
function after convolution with this field.

The initial clustering results, redshift maps, etc. of the
Durham/UKST Galaxy Redshift Survey were summarized
in the first paper of this series (Ratcliffe et al. 1996a). In
this paper we present a detailed analysis of the 2-point correlation
function clustering techniques and results from this
optically selected survey. We briefly describe our survey in
Section 2. The different methods of estimating the 2-point
correlation function from a magnitude limited redshift survey
are described and tested in Section 3. In Section 4 we use
the optimal method available to estimate the galaxy 2-point
correlation function from the Durham/UKST survey and
compare with the results from other galaxy redshift surveys
and models of structure formation. The projected 2-point
correlation function is described and estimated in Section 5.
Finally, we summarize our conclusions from this analysis in
the final section.



2 THE DURHAM/UKST GALAXY REDSHIFT 
SURVEY 

The Durham/UKST Galaxy Redshift Survey was constructed
using the FLAIR fibre optic system (Parker &
Watson 1995) on the 1.2m UK Schmidt Telescope at Siding
Spring, Australia. This survey uses the astrometry
and photometry from the Edinburgh/Durham Southern
Galaxy Catalogue (EDSGC; Collins, Heydon-Dumbleton
& MacGillivray 1988; Collins, Nichol & Lumsden 1992)
and was completed in 1995 after a 3-yr observing programme.
The survey itself covers a ~20° x 75° area centered
on the South Galactic Pole (60 UKST plates) and
is sparse sampled at a rate of one in three of the galaxies
to $b_J \sim 17$ mag. The resulting survey contains ~2500
redshifts, probes to a depth greater than 300 $h^{-1}$ Mpc, with
a median depth of ~150 $h^{-1}$ Mpc, and surveys a volume of
space of ~$4 \times 10^6$ ($h^{-1}$ Mpc)$^3$.

The survey is >75 per cent complete to the nominal
magnitude limit of $b_J = 17.0$ mag. This incompleteness
was mainly caused by poor observing conditions, intrinsically
low throughput fibres and various other observational
effects. In a comparison with ~150 published galaxy velocities
(Peterson et al. 1986; Fairall & Jones 1988; Metcalfe et
al. 1989; da Costa et al. 1991) our measured redshifts had
negligible offset and were accurate to ±150 km s$^{-1}$. The scatter
in the EDSGC magnitudes has been estimated at ±0.22
mag (Metcalfe, Fong & Shanks 1995) for a sample of ~100
galaxies. This scatter has been confirmed by a preliminary
analysis of a larger sample of high quality CCD photometry.
All of these observational details are discussed further in a
forthcoming data paper (Ratcliffe et al., in preparation).



3 ESTIMATING THE 2-POINT
CORRELATION FUNCTION FROM A
MAGNITUDE LIMITED SURVEY

In a volume limited, fair sample of the Universe (where the
edge effects of the galaxy survey can be neglected) an unbiased
estimate of the 2-point correlation function, $\xi(x)$, at
separation $x$, is given by

$$\xi(x) = \frac{DD(x)}{RR(x)} \left(\frac{n_R}{n_D}\right)^2 - 1, \qquad (1)$$

where $DD(x)$ and $RR(x)$ are the data-data and random-random
pair counts at separation $x$ and $n_D$ and $n_R$ are the
mean densities of the data (galaxy) and random surveys,
respectively. However, for an apparent magnitude limited survey
over a given fraction of the sky things are not so simple.
To estimate $\xi$ one has to deal with a falling radial number
density, decide how best to treat the edges of the survey and
allow for the effects of being forced to calculate the mean
density internally from the survey itself. These problems
determine the choice of the estimator used to calculate $\xi$ and
the weighting assigned to each data/random point. We take an
empirical approach to the solution of this problem and investigate
the different estimators and weightings on an equal footing.



3.1 The Methods of Estimation 

We will present results of the redshift space 2-point correlation
function, $\xi(s)$, where the redshift space separation
between two points $i$ and $j$ is given by

$$s = \sqrt{s_i^2 + s_j^2 - 2 s_i s_j \cos\theta}, \qquad (2)$$

where $s_i$ and $s_j$ are the comoving redshift distances of the
two points separated by an angle $\theta$ on the sky (also see
Fig. 7). Therefore, we have assumed a $q_0 = 1/2$, $\Lambda = 0$
cosmology with comoving distances given by

$$s_i = \frac{2c}{H_0} \left(1 - \frac{1}{\sqrt{1 + z_i}}\right), \qquad (3)$$

where $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$ is the Hubble constant, $c$ is
the velocity of light in km s$^{-1}$ and $z_i$ the observed redshift.
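
As a minimal sketch of how equations (2) and (3) might be evaluated in practice, the following Python fragment converts redshifts to comoving distances for the $q_0 = 1/2$, $\Lambda = 0$ cosmology and then forms the redshift space separation. The function names and the choice $h = 1$ (so that distances come out in $h^{-1}$ Mpc) are illustrative assumptions, not part of the original analysis.

```python
import math

C_KMS = 299792.458  # velocity of light in km/s


def comoving_distance(z, h=1.0):
    """Comoving distance in Mpc for q0 = 1/2, Lambda = 0 (equation 3).

    With h = 1 the result is in units of h^-1 Mpc.
    """
    hubble = 100.0 * h  # Hubble constant in km/s/Mpc
    return (2.0 * C_KMS / hubble) * (1.0 - 1.0 / math.sqrt(1.0 + z))


def redshift_space_separation(z_i, z_j, theta):
    """Separation s between two points an angle theta (radians) apart (equation 2)."""
    s_i = comoving_distance(z_i)
    s_j = comoving_distance(z_j)
    s_squared = s_i**2 + s_j**2 - 2.0 * s_i * s_j * math.cos(theta)
    # Guard against a tiny negative value caused by floating point rounding.
    return math.sqrt(max(s_squared, 0.0))
```

For two galaxies along the same line of sight ($\theta = 0$) the separation reduces to the difference of their comoving distances, which is a convenient sanity check.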

We calculate the radial selection function using standard
methods involving integrals over the galaxy luminosity
function (e.g. Ratcliffe et al. 1996b). Random points are then
distributed radially within the survey's angular limits with
a probability proportional to the radial selection function,
volume element and completeness rate of the survey. For
the Durham/UKST survey this means distributing points
within each of the 60 UKST fields separately because not
only are the magnitude limits slightly different for each field
(hence the radial selection function is slightly different) but
the completeness rates are also slightly different. We have
checked that this method of distributing the random points
does not cause any systematic biases in $\xi$, see Section 3.2.
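
The radial part of this random catalogue construction can be sketched as inverse-CDF sampling from a density proportional to (selection function) x (volume element). The Python fragment below is only an illustration under stated assumptions: `toy_selection` is a hypothetical stand-in for the real selection function (which comes from the luminosity function), and the completeness rate of a field is assumed to enter simply through the number of points requested for that field.

```python
import bisect
import math
import random


def toy_selection(r, r_c=150.0):
    """Hypothetical radial selection function, for illustration only."""
    return math.exp(-(r / r_c) ** 2)


def sample_random_distances(selection, r_max, n_points, n_grid=2000, seed=0):
    """Draw distances with probability density proportional to selection(r) * r**2."""
    rng = random.Random(seed)
    dr = r_max / n_grid
    edges = [i * dr for i in range(n_grid + 1)]
    # Tabulate the cumulative distribution with the midpoint rule.
    cdf = [0.0]
    for i in range(n_grid):
        r_mid = 0.5 * (edges[i] + edges[i + 1])
        cdf.append(cdf[-1] + selection(r_mid) * r_mid**2 * dr)
    total = cdf[-1]
    samples = []
    for _ in range(n_points):
        u = rng.random() * total
        k = min(bisect.bisect_right(cdf, u) - 1, n_grid - 1)  # grid cell containing u
        cell = cdf[k + 1] - cdf[k]
        frac = (u - cdf[k]) / cell if cell > 0 else 0.0
        samples.append(edges[k] + frac * dr)  # linear interpolation within the cell
    return samples


# Example: one field, with the number of points scaled by an assumed completeness.
field_randoms = sample_random_distances(toy_selection, 300.0, int(1000 * 0.75))
```

Repeating this for each field with its own selection function and completeness would follow the per-field procedure described above; the angular positions within a field would be drawn separately.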



We then calculate the total number of data-data (DD),
data-random (DR) and random-random (RR) pair counts in
the survey and bin according to the pair separation of the
points in question. We choose to bin our counts in 0.1 dex
bins of separation starting at 0.1 $h^{-1}$ Mpc. We calculate the
2-point correlation function using three different estimators
and two different weightings. The estimators investigated
here are the standard estimator (e.g. Peebles 1980)

$$\xi(x) = \frac{DD(x)}{DR(x)} \left(\frac{n_R}{n_D}\right) - 1, \qquad (4)$$

the estimator proposed by Hamilton (1993)

$$\xi(x) = \frac{DD(x)\,RR(x)}{DR(x)^2} - 1, \qquad (5)$$

and that of Landy & Szalay (1993)

$$\xi(x) = \frac{DD(x) - 2DR(x) + RR(x)}{RR(x)}. \qquad (6)$$
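
The three estimators of equations (4), (5) and (6) can be written down directly once the binned pair counts are available. The sketch below assumes that the counts in a bin scale as $DD \propto n_D^2$, $DR \propto n_D n_R$ and $RR \propto n_R^2$ (i.e. all three are counted with the same pair convention), and that equation (6) is applied to counts normalised by these density factors; both are assumptions made for illustration, as are the function names.

```python
def log_bin_edges(s_min=0.1, dex=0.1, n_bins=30):
    """Edges of logarithmic separation bins, 0.1 dex wide, starting at 0.1 h^-1 Mpc."""
    return [s_min * 10.0 ** (dex * i) for i in range(n_bins + 1)]


def xi_standard(dd, dr, n_d, n_r):
    """Standard estimator, equation (4): DD/DR * (n_R/n_D) - 1."""
    return (dd / dr) * (n_r / n_d) - 1.0


def xi_hamilton(dd, dr, rr):
    """Hamilton (1993) estimator, equation (5): DD*RR/DR^2 - 1."""
    return dd * rr / dr**2 - 1.0


def xi_landy_szalay(dd, dr, rr, n_d, n_r):
    """Landy & Szalay (1993) estimator, equation (6), on density-normalised counts."""
    dd_n = dd / n_d**2
    dr_n = dr / (n_d * n_r)
    rr_n = rr / n_r**2
    return (dd_n - 2.0 * dr_n + rr_n) / rr_n


# Toy counts in one bin, built so that the true correlation is 0.5
# with 30 times as many random points as data points.
n_d, n_r = 1.0, 30.0
rr = 9000.0
dr = rr * (n_d / n_r)
dd = 1.5 * rr * (n_d / n_r) ** 2
```

Note that the Hamilton form needs no density ratio at all, since the factors of $n_D$ and $n_R$ cancel between numerator and denominator; this is one way to see why it is less sensitive to the error in the mean density.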



The two weightings investigated here are a simple unit
weighting

$$w(r) = 1, \qquad (7)$$

and the so-called minimum variance weighting (Efstathiou
1988; Peebles 1973; Loveday et al. 1995)

$$w(r, x) = \frac{1}{1 + 4\pi n(r) J_3(x)}, \qquad (8)$$

where $n(r)$ is the radial number density and $J_3(x) =
\int_0^x \xi(y)\, y^2\, dy$ is the volume integral of the 2-point correlation
function out to a separation $x$. These estimators are
essentially Monte Carlo integrations over the spherical-shell
shaped volumes of the bins. These methods are particularly
useful at the edges of the survey where conventional integration
techniques are impractical. In order to reduce statistical
fluctuations we use 25-50 times as many random points as
there are data points.

The standard estimator of equation (4) stood for many
years as the best estimate of $\xi$ from these types of survey,
with the RR → DR difference from equation (1) giving
a better estimate for the Monte Carlo volume integration.
However, this estimator is sensitive to the error in the
mean density (Hamilton 1993). Estimators which are sensitive
only to the square of the error in the mean density are those
proposed by Hamilton (1993) and Landy & Szalay (1993).
Also, while weighting each data/random point equally is the
simplest method, the pair count will be dominated by the
structures in the survey near the peak of the radial number
density function. This weighting essentially reduces the
effective volume of the survey as volumes are unequally sampled.
To weight volumes equally one should weight by the
inverse of the radial selection function. Unfortunately, such
a weighting is dominated by the few galaxies at large distances,
where the selection function is small. Following on
from work pioneered by Peebles (1973), Efstathiou (1988)
has proposed a weighting which provides the mathematical
minimum in the estimate of the variance of $\xi$. This weighting
turns out to be a happy medium between equal pair
weighting and equal volume weighting. To use Efstathiou's
(1988) weighting we need an estimate of $\xi$, namely the quantity
we are trying to calculate. This can be achieved via
iteration but in practice a $(r_0/r)^{\gamma}$ power law model for $\xi$
suffices. We use the canonical values of $r_0 = 5.0\,h^{-1}$ Mpc
and $\gamma = 1.8$ (e.g. Peebles 1980). We include an upper limit
of $J_3^{max} = 5000\,h^{-3}$ Mpc$^3$ in our weighting and find our
estimates of $\xi$ relatively insensitive to doubling/halving this
value.
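
For a power law $\xi(y) = (r_0/y)^{\gamma}$ the volume integral has the closed form $J_3(x) = r_0^{\gamma} x^{3-\gamma}/(3-\gamma)$, so the weighting of equation (8) is cheap to evaluate. The following sketch shows one way this could be coded; the function names are illustrative, and applying the upper limit as a simple cap on $J_3$ is an assumption about how the limit is imposed.

```python
import math


def j3_power_law(x, r0=5.0, gamma=1.8, j3_max=5000.0):
    """Volume integral of a (r0/y)^gamma power law out to separation x, capped at j3_max.

    Units: x and r0 in h^-1 Mpc, result in h^-3 Mpc^3.
    """
    j3 = r0**gamma * x ** (3.0 - gamma) / (3.0 - gamma)
    return min(j3, j3_max)


def min_variance_weight(n_r, x, r0=5.0, gamma=1.8, j3_max=5000.0):
    """Minimum variance weight of equation (8) for radial number density n_r."""
    return 1.0 / (1.0 + 4.0 * math.pi * n_r * j3_power_law(x, r0, gamma, j3_max))
```

Where the number density is high the weight falls well below unity, while in the sparsely sampled outer parts of the survey it tends towards 1, which is the compromise between equal pair and equal volume weighting described above.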



3.2 Testing the Methods 

We have tested the reliability of the methods described in
Section 3.1 using mock catalogues of the Durham/UKST
survey drawn from two sets of cold dark matter (CDM)
N-body simulations. These mock catalogues were constructed
in redshift space using the same angular/radial
selection functions and completeness rates as the actual
Durham/UKST survey. The CDM models used were
(Efstathiou et al. 1985; Gaztanaga & Baugh 1995; Eke et al.
1996): standard CDM with $\Omega h = 0.5$, $b = 1.6$ (SCDM); and
CDM with $\Omega h = 0.2$, $b = 1$ and a cosmological constant
($\Lambda = 0.8$) to ensure a spatially flat cosmology (LCDM).
Each mock catalogue was selected in such a way as to sample
an independent volume of space from within the simulation.
Given the relative SCDM and LCDM comoving cube
sizes (256 and 378 $h^{-1}$ Mpc), this implied that we could select
a total of 18 SCDM mock catalogues from the 9 available
SCDM simulations and 15 LCDM mock catalogues from the
5 available LCDM simulations.

Figs. 1 and 2 show the results of applying the six different
estimator and weighting combinations of Section 3.1 to
the SCDM and LCDM mock catalogues, respectively. The
circular, square and triangular symbols denote the estimators
of equations (4), (5) and (6), respectively. Also, open symbols
denote the unweighted estimates of equation (7) while
closed symbols denote the weighted estimates of equation (8).
The dotted line on these plots is the same simple power
law model and as such can be used as a reference point.
The solid lines on these plots denote the average of the
actual redshift space 2-point correlation function, $\xi(s)$,
calculated directly from the full N-body simulations (SCDM
and LCDM, respectively). Given that these are fully volume
limited fair samples containing $N$ galaxies with a well
defined mean density, $n$, we use equation (1) and $RR =
(4\pi/3)\, n N (r_{outer}^3 - r_{inner}^3)$, where $r_{inner}$ and $r_{outer}$ define the
extent of the radial bin in question. The error bars shown
are the $1\sigma$ standard deviation obtained from the observed
scatter between the mock catalogues. We have assumed that
each mock catalogue provides a statistically independent
estimate of $\xi(s)$. Also, to aid graphical clarity we only plot
alternate error bars from the three estimators.

We also constructed a set of mock catalogues with constant
completeness rates in each field. The results obtained
were almost identical and therefore our method of distributing
the random points does indeed correct for the variable
completeness rates of each field. Another set of mock catalogues
was constructed in real space rather than redshift
space. Again, the results obtained were very similar and
therefore the conclusions of this section are independent of
real and redshift space effects.

From the SCDM mock catalogue results in Fig. 1 we see
that, on small scales (< 10 $h^{-1}$ Mpc), all of the estimates can
reproduce the actual $\xi(s)$, although the weighted estimates
are more accurate and show less scatter. On large scales
(10-100 $h^{-1}$ Mpc), all of the unweighted estimates agree well but
appear biased low by ~0.03 in $\xi$. However, the weighted
estimates trace the actual $\xi(s)$ very well, except for the
$DD/DR - 1$ estimator. These weighted estimates also have
smaller error bars than the unweighted ones, again except
for the $DD/DR - 1$ estimator. On very large (> 100 $h^{-1}$ Mpc)
scales we do not expect the mock catalogues to produce
believable results given the survey geometry involved.

We draw similar conclusions from the LCDM mock catalogue
results of Fig. 2. This model has both a higher amplitude
on small scales and more power on large scales than the
SCDM model. This time the unweighted estimates are biased
low by ~0.08 on large (10-100 $h^{-1}$ Mpc) scales. Again,
the weighted estimates trace the actual $\xi(s)$ well on these
scales. In particular, the weighted $DD \cdot RR/DR^2 - 1$ estimator
very accurately describes $\xi(s)$ on all scales and also has the
smallest error bars.



3.3 Errors and Biases in the Estimates

The theoretical error in the 2-point correlation function on
large/linear scales has been estimated by Peebles (1973); see
also Kaiser (1986). Consider a wide bin containing $N_p$ data
pairs in a single radial shell with observed number density
$n(r)$. Assuming that $\xi$ is small ($\ll 1$) then the error in $\xi(x)$
is given by

$$\Delta\xi(x) = \frac{1 + 4\pi n(r) J_3(x)}{\sqrt{N_p}}. \qquad (9)$$

This is essentially a $\sqrt{N}$ Poisson error modified for the effects
of clustering, which reduces the amount of independent
information available. We measure the maximum value of
$4\pi J_3$ for the SCDM and LCDM models to be ~7000 and
17000 $h^{-3}$ Mpc$^3$, respectively. If we consider the survey as a
whole then $N_p \sim n_{gal}^2$, where $n_{gal}$ is the total number of
galaxies in the survey. This implies a minimum theoretical
error of $\Delta\xi \sim 0.002$ and 0.007 in the SCDM and LCDM
mock catalogues, respectively. However, in studies of QSO
clustering Shanks & Boyle (1994) have empirically shown
that the error in equation (9) only works well on scales where
$N_p < n_{gal}$. On scales where $N_p > n_{gal}$ a more realistic error
estimate is given by $\Delta\xi \sim 1/\sqrt{n_{gal}}$. Given that we observe
$N_p \sim n_{gal} \sim 2500$ on 5-10 $h^{-1}$ Mpc scales, we expect a minimum
error of $\Delta\xi \sim 0.02$ on scales larger than this.
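
The two error estimates discussed here are simple enough to check numerically. The sketch below evaluates the clustering-modified Poisson error of equation (9) and the empirical floor $1/\sqrt{n_{gal}}$; the function names are illustrative assumptions, and the only number taken from the text is $n_{gal} \sim 2500$.

```python
import math


def clustering_error(n_pairs, n_r, j3):
    """Error on xi from equation (9): (1 + 4 pi n J3) / sqrt(N_p)."""
    return (1.0 + 4.0 * math.pi * n_r * j3) / math.sqrt(n_pairs)


def empirical_error_floor(n_gal):
    """Empirical minimum error 1/sqrt(n_gal), used where N_p exceeds n_gal."""
    return 1.0 / math.sqrt(n_gal)


floor = empirical_error_floor(2500)  # 1/sqrt(2500) = 1/50 = 0.02
```

With $n_{gal} \sim 2500$ the floor evaluates to 0.02, matching the minimum error quoted for scales beyond 5-10 $h^{-1}$ Mpc.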
Figure 1. Testing the methods of estimating the redshift space 2-point correlation function, $\xi(s)$, from standard CDM mock catalogues
which mimic the Durham/UKST survey. On all of these plots open symbols denote $w = 1$ unweighted estimates and closed symbols
denote $w = 1/(1 + 4\pi n(r) J_3(x))$ weighted estimates. Also, circular, square and triangular symbols denote the estimators of equations (4), (5)
and (6), respectively. Figs. (a) and (c) are plotted on a log-log scale to emphasize the small scale features, while Figs. (b) and (d) are
plotted on a log-linear scale to emphasize the large scale features. The dotted line on each plot is the same simple power law model and
can be used as a reference point. The solid line is the redshift space 2-point correlation function calculated directly from the N-body
simulations which are used to construct the mock catalogues. Error bars are the $1\sigma$ scatter seen between the mock catalogues assuming
each one provides an independent estimate of $\xi$. To aid graphical clarity we plot the alternate error bars of the three estimators.



Figure 2. The same as Fig. 1 but for the mock catalogues constructed from the low-$\Omega$ CDM model with a non-zero $\Lambda$ to ensure spatial
flatness.

A possible bias in the estimation of $\xi$ is due to the fact
that we estimate both the mean density and the pair counts
from the same survey. This leads to a non-zero difference
between the true $\xi$ of an ensemble of surveys and the ensemble
average of the $\xi$'s from each survey. This is commonly known
as the Integral Constraint (e.g. Peebles 1980) and is given
by

$$I_c = \frac{1 + 4\pi n(r) J_3^{max}}{n_{gal}}, \qquad (10)$$

which should be added to $\xi$ in an ensemble of surveys. One
can simplify this expression by assuming $1 \ll 4\pi n(r) J_3^{max}$
and using $n(r) \sim n_{gal}/V_{eff}$ to give

$$I_c \simeq \frac{4\pi J_3^{max}}{V_{eff}}, \qquad (11)$$

where the effective volume sampled by the survey is given
by

$$V_{eff} = \int f(r)\, dV, \qquad (12)$$

and $f(r)$ is a function which reflects the weighting of the
galaxies. For example, if we weight pairs equally then $f$
is just the radial selection function, while equal volume
weighting implies that $f$ is the inverse of the radial selection
function. For a typical mock catalogue we calculate
$V_{eff} \sim 2 \times 10^5\,h^{-3}$ Mpc$^3$ for equal pair weighting and
~$4 \times 10^6\,h^{-3}$ Mpc$^3$ for equal volume weighting. Recalling
the maximum values of $4\pi J_3$ quoted previously we find that
$I_c \sim 0.035$ and 0.085 for the SCDM and LCDM mock catalogues,
respectively, when using equal pair weighting and
$I_c \sim 0.002$ and 0.004 when using equal volume weighting.
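
The Integral Constraint values quoted in the text follow from equation (11) by simple division, which the short sketch below reproduces. The inputs are the maximum $4\pi J_3$ values and effective volumes given in the text; the function and variable names are illustrative only.

```python
def integral_constraint(four_pi_j3_max, v_eff):
    """Approximate Integral Constraint of equation (11): 4 pi J3^max / V_eff."""
    return four_pi_j3_max / v_eff


FOUR_PI_J3_MAX = {"SCDM": 7000.0, "LCDM": 17000.0}            # h^-3 Mpc^3
V_EFF = {"equal pair": 2.0e5, "equal volume": 4.0e6}          # h^-3 Mpc^3

for weighting, volume in V_EFF.items():
    for model, j3 in FOUR_PI_J3_MAX.items():
        print(f"{model}, {weighting} weighting: I_c ~ {integral_constraint(j3, volume):.4f}")
```

The roughly twentyfold difference between the two effective volumes is what drives the Integral Constraint from a few per cent under equal pair weighting down to a few tenths of a per cent under equal volume weighting.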



3.4 Optimal Estimate 

In Section 3.3 the realistic minimum error in an individual
mock catalogue was estimated to be $\Delta\xi \sim 0.02$ on
large scales, for both the SCDM and LCDM models. As
an example, the error bars on an individual SCDM mock
catalogue are given in Figs. 3(a) and 3(b). These plots
show that this is a good estimate for the errors from
both the weighted and unweighted $DD \cdot RR/DR^2 - 1$ and
$(DD - 2DR + RR)/RR$ estimators and they asymptote
towards this value on large scales (10-100 $h^{-1}$ Mpc). However,
the most consistently small error bars are given by
the weighted $DD \cdot RR/DR^2 - 1$ and $(DD - 2DR + RR)/RR$
estimators.

Figure 3. An example of the error bars ($\Delta\xi$) on an individual
mock catalogue for the standard CDM model. Fig. (a) shows the
results from the unweighted versions of the 3 estimators, while
Fig. (b) shows the corresponding weighted estimates. These error
bars appear to asymptote to a value of $\Delta\xi \sim 0.02$ on the larger
scales, in good agreement with our estimated minimum possible
error bar. Very similar results were found for the LCDM model.

We can also compare any systematic biases in the estimates
of Figs. 1 and 2 with the predicted Integral Constraint
from Section 3.3. We see that all of the unweighted
estimates suffer from a systematic bias which is in good
agreement with the predictions from the Integral Constraint:
~0.03 compared with 0.035 for the SCDM mock catalogues;
and ~0.08 compared with 0.085 for the LCDM mock catalogues.
For the weighted estimates there is no noticeable
Integral Constraint for either set of mock catalogues for
the $DD \cdot RR/DR^2 - 1$ and $(DD - 2DR + RR)/RR$ estimators.
Again, this is in good agreement with the small
value predicted, namely < 0.005. We note that the weighted
$DD \cdot RR/DR^2 - 1$ estimate most accurately reproduces the
actual $\xi$ of both the SCDM and LCDM models on all scales.

Given the historical importance of the standard
$DD/DR - 1$ estimator we briefly discuss the results obtained
from it. Empirically we observe that the weighted estimate
produces a larger error bar than the unweighted estimate.
This is in direct contradiction with the fact that this weighting
was constructed in order to produce the minimum variance
in $\xi$. This is only seen in the $DD/DR - 1$ estimates and
therefore could be due to the estimator itself. It is possibly
related to the fact that this estimator is sensitive to the error
in the mean density, unlike the other estimators which are
sensitive only to the square of this error. Also,
while the systematic bias in the unweighted estimate can be
explained by the Integral Constraint, the observed bias in
the weighted estimate remains unexplained. These results
involving the $DD/DR - 1$ estimator are in good agreement
with a similar study of pencil-beam surveys carried out by
Fong, Hale-Sutton & Shanks (1991).

To conclude this section we answer the question of
which weighting and estimator combination used on a mock
catalogue optimally reproduces the actual 2-point correlation
function. We have found that all of our estimates appear
limited by a minimum error bar which comes directly
from the number of galaxies in the survey. Also, the unweighted
estimates of $\xi$ are all systematically biased low by
an amount predicted by the Integral Constraint. This is due
to the fact that this equal pair weighting reduces the effective
volume of the survey. Finally, we see that the weighting/estimator
combination which most accurately traces the
actual $\xi$ and has the smallest error bars is given by the
$w = 1/(1 + 4\pi n(r) J_3(x))$ weighting of Efstathiou (1988) and
the $DD \cdot RR/DR^2 - 1$ estimator of Hamilton (1993). This is
what we call our optimal estimate of $\xi$ from a magnitude
limited redshift survey.



4 THE REDSHIFT SPACE GALAXY 2-POINT 
CORRELATION FUNCTION 

We estimate the redshift space 2-point correlation function,
$\xi(s)$, using the methods described in Section 3. We use
the magnitude limits described in Ratcliffe et al. (1996b)
which maximize depth and minimize observational incompleteness
in the survey. Using these limits we have $\langle m_{lim} \rangle =
16.86 \pm 0.25$ with an average completeness rate of 75 per
cent. Section 3.2 showed that the methods of estimation were
able to account for the effects of having a slightly different
magnitude limit and completeness rate in each of the 60
UKST fields.



4.1 Results from the Durham/UKST Survey 

Fig. 4 shows the results of applying these methods to the
Durham/UKST Galaxy Redshift Survey. We use Hamilton's
(1993) $DD \cdot RR/DR^2 - 1$ estimator but show the results from
both the $w = 1$ unweighted and the $w = 1/(1 + 4\pi n(r) J_3(x))$
weighted estimates for clarity. Figs. 4(a) and 4(b) are plotted
on log-log and log-linear scales to emphasize the small and
large scale features, respectively. Open symbols denote the
unweighted estimate and solid symbols denote the weighted
one. The dotted line shows the canonical power law model
for $\xi$ of $(5.0\,h^{-1}\,{\rm Mpc}/s)^{1.8}$, while the solid line shows the
best fitting power law model to the weighted $\xi(s)$ in the
1-30 $h^{-1}$ Mpc range. The error bars shown are the $1\sigma$ standard
deviation on an individual low-$\Omega$ + $\Lambda$ CDM mock catalogue
(LCDM). Obviously, these error bars use the same weighting/estimator
combination as the data points in question.

Figure 4. Estimates of the redshift space 2-point correlation
function, $\xi(s)$, from the Durham/UKST Galaxy Redshift Survey
using Hamilton's (1993) estimator. Figs. (a) and (b) are plotted
on log-log and log-linear scales to emphasize the small and large
scale features, respectively. Open symbols denote the unweighted
estimate and solid symbols denote the weighted one. The dotted
line shows the canonical power law model, while the solid line
shows the best fitting power law model to the weighted $\xi(s)$ in
the indicated range.

Table 1. Comparison of the best fit redshift space 2-point correlation
function parameters from the Durham/UKST survey with
recent galaxy redshift survey results and also previous Durham
ones.

Survey          $s_0$ ($h^{-1}$ Mpc)    $\gamma$
Durham/UKST     6.8 ± 0.3               1.25 ± 0.06
APM-Stromlo     5.9 ± 0.3               1.47 ± 0.12
Las Campanas    6.8 ± 1.1               1.70 ± 0.11
DARS/SAAO       6.5 ± 0.5               (1.8)

On all scales smaller than ~100 $h^{-1}$ Mpc we see that
the unweighted estimate is systematically lower than the
weighted one. We have tested whether this could be explained
by any systematic errors in the photometry, the method of
incompleteness correction or the errors in the measured redshifts
and found that it could not. It appears, quite simply,
to be caused by the different weightings used. Therefore, it
is thought to be partially statistical and partially due to the
Integral Constraint. Indeed, using the value of $J_3^{max}$ estimated
from the weighted $\xi(s)$ in Fig. 4 we find that equal
pair weighting could cause an Integral Constraint of ~0.25
in $\xi$. This is large enough to explain all of the observed difference
on > 10 $h^{-1}$ Mpc scales. Equal volume weighting has
an estimated Integral Constraint of ~0.01 and is therefore
not a problem for the weighted estimate. We fit our power
law model using a minimum $\chi^2$ statistic and the best fit parameters
are presented in Table 1. This gave a $\chi^2$ of ~10 for
13 degrees of freedom, which is an adequate fit. Errors on
these parameters come from the appropriate $\Delta\chi^2$ contour
about this minimum. However, given the correlated nature
of these points, we anticipate that our quoted errors are
more than likely an underestimate. This should be adequate
for the simple comparison done here.
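
A minimum $\chi^2$ power law fit of this kind can be sketched with a brute-force grid search. The Python fragment below is illustrative only: it fits synthetic, noise-free points generated from a known power law (not the survey data), assumes hypothetical 10 per cent error bars, and treats the bins as independent, which is exactly the simplification that the text warns will underestimate the parameter errors.

```python
def power_law(s, s0, gamma):
    """Power law model xi(s) = (s0 / s)^gamma."""
    return (s0 / s) ** gamma


def chi_squared(s_vals, xi_vals, xi_errs, s0, gamma):
    """Chi-squared of the model, assuming independent (uncorrelated) bins."""
    return sum(((xi - power_law(s, s0, gamma)) / err) ** 2
               for s, xi, err in zip(s_vals, xi_vals, xi_errs))


def fit_power_law(s_vals, xi_vals, xi_errs, s0_grid, gamma_grid):
    """Return (chi2, s0, gamma) at the grid point with the smallest chi-squared."""
    best = None
    for s0 in s0_grid:
        for gamma in gamma_grid:
            chi2 = chi_squared(s_vals, xi_vals, xi_errs, s0, gamma)
            if best is None or chi2 < best[0]:
                best = (chi2, s0, gamma)
    return best


# Synthetic example: 15 bins of 0.1 dex from 1 h^-1 Mpc, drawn from a known model.
s_vals = [10.0 ** (0.1 * i) for i in range(15)]
xi_vals = [power_law(s, 6.8, 1.25) for s in s_vals]
xi_errs = [0.1 * xi for xi in xi_vals]  # hypothetical 10 per cent errors
s0_grid = [5.0 + 0.1 * i for i in range(31)]
gamma_grid = [1.0 + 0.05 * i for i in range(21)]
best_chi2, best_s0, best_gamma = fit_power_law(s_vals, xi_vals, xi_errs, s0_grid, gamma_grid)
```

With 15 points and 2 free parameters the fit has 13 degrees of freedom, the same number as quoted in the text. Parameter errors would then be read off the $\Delta\chi^2$ contours around the minimum on the same grid.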

Finally, given the results of Section 3.4, we favour the
weighted $\xi(s)$ presented here as our best estimate of the
redshift space 2-point correlation function.



4.2 Comparison with other Redshift Surveys 

Table 1 also gives a comparison of the best fit power law
parameters of the $\xi(s)$ estimated from some recent optical
galaxy redshift surveys (Loveday et al. 1992, 1995; Tucker et
al. 1996) and also previous Durham ones (Shanks et al. 1983,
1989). We see that the best fit redshift space correlation
lengths, $s_0$, all agree well with a value of ~6.5 $h^{-1}$ Mpc. However,
the slopes, $\gamma$, all differ significantly given the quoted
error bars. (Note that the DARS/SAAO survey had $\gamma$ fixed
at 1.8 during the fitting.) Therefore, while the amplitude of
$\xi(s)$ appears well determined, there is considerable scatter
in the value of the redshift space slope from the currently
available data sets.

The results from these surveys are directly compared in
Figs. 5(a) and 5(b) where they are plotted on log-log and
log-linear scales to emphasize the small and large scale features,
respectively. The error bars shown on the Durham/UKST
estimate are again those from LCDM mock catalogues. On
small scales (< 10 $h^{-1}$ Mpc) we see that all of the estimates
are very consistent. On larger scales (> 10 $h^{-1}$ Mpc) we see
that the new Durham/UKST estimate agrees well with the
previously claimed detections of large scale power out to
~40 $h^{-1}$ Mpc by the APM-Stromlo and Las Campanas surveys.
On even larger scales (> 50 $h^{-1}$ Mpc) all of the surveys
are consistent with zero. All of these $\xi$'s use the estimator
of Hamilton (1993) and the weighting of Efstathiou
(1988), apart from the previous Durham DARS/SAAO results.
These authors used the $DD/DR - 1$ estimator with
a $w = 1$ weighting, but they did test against the possibility
that the integral constraint could be as large as
implied by the $w(\theta)$ found from the APM survey. Also,
Fong et al. (1991) tested the effect of volume weighting the
DARS/SAAO data and found that the correlation function
estimate only rose slightly; they also found the increase in
error from the combined use of volume weighting and the
$DD/DR - 1$ estimator now reproduced in our analysis here
(see Fig. 5). Hence, we conclude that the reason that the
DARS/SAAO results are biased low is partly due to the use
of equal pair weighting but mainly due to statistical fluctuations
in the early redshift survey data.

Figure 5. Comparison of the Durham/UKST redshift space 2-point
correlation function, $\xi(s)$, with that from recent optical
galaxy redshift surveys (Loveday et al. 1992, 1995; Tucker et al.
1996) and previous Durham surveys (Shanks et al. 1983, 1989).
Figs. (a) and (b) are plotted on log-log and log-linear scales to
emphasize the small and large scale features, respectively.

Our conclusions from Table 1 and Fig. 5 are that a
simple, one power law model does not give a good fit to
the present data sets. However, the actual results from the
different surveys do in fact agree well on all scales on a
qualitative level, except for the DARS/SAAO results (which
have large systematic errors).



4.3 Comparison with Structure Formation Models 

We compare the redshift space 2-point correlation function
from the Durham/UKST survey with the predictions
of two popular structure formation models. The models we
use are those from the cold dark matter simulations of
Section 3.2, namely the standard CDM model (SCDM) and the
low-$\Omega$ + $\Lambda$ CDM model (LCDM). Historically, the SCDM
model has been the standard model of structure formation
for over 10 years (e.g. Davis et al. 1985), while the LCDM
model is a useful phenomenological model for recent large
scale structure results (e.g. Loveday et al. 1992; Baugh &
Efstathiou 1993). In Figs. 6(a) and 6(b) we plot these results
on log-log and log-linear scales to emphasize the small and
large scale features, respectively. The shaded areas on Fig. 6
denote the 68 per cent confidence region on an individual
mock catalogue, see Figs. 1 and 2. Given that the comparison
here is to see how often the CDM mock catalogues can
reproduce the Durham/UKST result (i.e. what is the scatter
in the CDM estimates) we do not plot error bars on the
Durham/UKST estimate. For consistency, all of the results
presented in this figure were calculated using the optimal
weighting/estimator combination of Efstathiou (1988) and
Hamilton (1993).

On small scales ($< 10\,h^{-1}$ Mpc) we see that both
the CDM models agree well with the results from the
Durham/UKST survey. On large scales ($> 10\,h^{-1}$ Mpc) the
SCDM model shows no significant power above ~20 $h^{-1}$ Mpc
whereas the LCDM model shows significant power out to
~30 $h^{-1}$ Mpc. Therefore, the Durham/UKST $\xi(s)$ has
significant power ($> 3\sigma$) above and beyond the SCDM model up
to ~40 $h^{-1}$ Mpc. While the LCDM model is more consistent
with the data, it also produces too little power in this region
at the 1-2$\sigma$ level. This rejection of SCDM is consistent with
the findings from the APM galaxy survey (Maddox et al.
1990; Loveday et al. 1992) and the QDOT infrared redshift
survey (Saunders et al. 1991).
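
The comparison above rests on the scatter between individual mock catalogues. As a rough illustration only, the Python sketch below expresses the offset of a survey value of $\xi(s)$ from the mock mean in units of the mock-to-mock standard deviation, for a single separation bin. It assumes that the 68 per cent region can be approximated by one standard deviation; the function name and the numbers are hypothetical and are not taken from the survey or the simulations.

```python
import statistics

def excess_in_sigma(xi_data, xi_mocks):
    """Offset of a survey xi(s) value from the mock mean, in units of the
    standard deviation between individual mock catalogues (one bin)."""
    mean = statistics.mean(xi_mocks)
    scatter = statistics.stdev(xi_mocks)  # sample standard deviation
    return (xi_data - mean) / scatter

# Illustrative numbers only, not values from the survey or simulations.
print(excess_in_sigma(0.4, [0.0, 0.1, 0.2]))
```

A value above about 3 in a bin would correspond to the kind of "$> 3\sigma$" excess quoted in the text, under the Gaussian-scatter assumption stated above.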

Figure 6. Comparison of the Durham/UKST redshift space 2-point
correlation function, $\xi(s)$, with the results calculated from
two models of structure formation, namely the standard CDM
model (SCDM) and the low-$\Omega$ + $\Lambda$ CDM model (LCDM). Mock
catalogues which mimic the Durham/UKST survey were selected
from the $N$-body simulations of these models and the shaded
areas denote the 68 per cent confidence regions in $\xi(s)$ as
estimated for an individual mock catalogue. Figs. (a) and (b) are
plotted on log-log and log-linear scales to emphasize the small
and large scale features, respectively.



5 THE PROJECTED GALAXY 2-POINT 
CORRELATION FUNCTION 

Surveys which use measured redshifts to estimate distances
have the problem that the actual clustering pattern is
imprinted with the galaxy peculiar velocity field. Specifically,
our distance estimates are distorted by the non-Hubble-flow
component of the galaxy peculiar velocity in the line of sight
direction. Therefore, while our fundamental interest (in
clustering terms) is in the real space 2-point correlation function,
only the redshift space 2-point correlation function is directly
observable from our survey. However, it is possible to model
the correlation function such that we can estimate it as a
simple real space power law.

Figure 7. Schematic diagram to show the definitions we use to
calculate the separations perpendicular ($\sigma$) and parallel ($\pi$) to
the line of sight of two points $i$ and $j$.

We define the projected 2-point correlation function,
$w_v(\sigma)$, by (e.g. Peebles 1980)

$$w_v(\sigma) = \int_{-\infty}^{\infty} \xi(\sigma, \pi)\, d\pi, \qquad (13)$$

$$w_v(\sigma) = 2 \int_{0}^{\infty} \xi(\sigma, \pi)\, d\pi, \qquad (14)$$

where $\xi(\sigma, \pi)$ is our usual 2-point correlation function, but
calculated as a function of the separations perpendicular ($\sigma$)
and parallel ($\pi$) to the line of sight. The definitions of $\sigma$ and
$\pi$ we use are schematically shown in Fig. 7. We found that
our results do not depend significantly on the exact nature
of these definitions and even the small angle approximation
gives reasonably consistent results.
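
To make the decomposition concrete, the Python sketch below splits the separation of two points into components parallel and perpendicular to the line of sight. It assumes one common convention, in which the line of sight is taken along the mean of the two position vectors with the observer at the origin; this is not necessarily the exact definition drawn in Fig. 7, and the function name `sigma_pi` is hypothetical.

```python
import math

def sigma_pi(ri, rj):
    """Split the separation of two points into perpendicular (sigma) and
    line-of-sight (pi) parts.

    Assumption: the line of sight is the direction of the mean position
    vector (ri + rj) / 2, with the observer at the origin.
    """
    s = [b - a for a, b in zip(ri, rj)]            # separation vector
    l = [(a + b) / 2.0 for a, b in zip(ri, rj)]    # mean position vector
    l_norm = math.sqrt(sum(c * c for c in l))
    pi = abs(sum(a * b for a, b in zip(s, l))) / l_norm   # projection on l
    s2 = sum(c * c for c in s)
    sigma = math.sqrt(max(s2 - pi * pi, 0.0))      # guard against round-off
    return sigma, pi
```

For two points on the same sight line the separation is purely in $\pi$, while for two points at equal distance on either side of a sight line it is purely in $\sigma$.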



5.1 Modelling the Projected Correlation Function 

The projected nature of equation 14 allows one to write

$$w_v(\sigma) = 2 \int_{0}^{\infty} \xi\left(\sqrt{\sigma^2 + \pi^2}\right) d\pi, \qquad (15)$$

where $\xi(\sqrt{\sigma^2 + \pi^2})$ is the real space 2-point correlation
function. Assuming a power law $\xi(r) = (r_0/r)^{\gamma}$ with
$r^2 = \sigma^2 + \pi^2$ and using the definition of the Beta function gives

$$\frac{w_v(\sigma)}{\sigma} = \left(\frac{r_0}{\sigma}\right)^{\gamma} \frac{\Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{\gamma - 1}{2}\right)}{\Gamma\left(\frac{\gamma}{2}\right)}, \qquad (16)$$

where $\Gamma(x)$ is the Gamma function and $\gamma > 1$ is assumed.
Therefore, we can fit our measured $w_v(\sigma)$ to estimate a
power law model of $\xi(r)$.

Figure 8. Estimates of the projected 2-point correlation
function for (a) the SCDM model and (b) the LCDM model. The
dotted line denotes the power law model for $w_v(\sigma)$ predicted by
equation 16. The solid line denotes the results of estimating $w_v(\sigma)$
directly from the $N$-body simulations using equation 17. The solid
points are the mean $w_v(\sigma)$ for the mock catalogues as estimated
from equation 17. The error bars plotted are the 1$\sigma$ standard
deviation on an individual mock catalogue.
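
As a numerical check of the power law relation in equation 16, the short Python sketch below evaluates $w_v(\sigma)$ for given $r_0$ and $\gamma$ using the standard library Gamma function. The function name is hypothetical. With $r_0 = 5.0$, $\gamma = 2.2$ and $r_0 = 6.0$, $\gamma = 2.2$ the amplitudes at $\sigma = 1\,h^{-1}$ Mpc come out close to the 95.7 and 142.9 quoted for the CDM models in Section 5.2.

```python
import math

def projected_power_law(sigma, r0, gamma):
    """w_v(sigma) for xi(r) = (r0 / r)**gamma, valid only for gamma > 1."""
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 for the projection to converge")
    amp = (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
           / math.gamma(gamma / 2.0))
    return sigma * (r0 / sigma) ** gamma * amp

# Amplitudes at sigma = 1 (lengths in h^-1 Mpc) for the two CDM power laws.
print(round(projected_power_law(1.0, 5.0, 2.2), 1))
print(round(projected_power_law(1.0, 6.0, 2.2), 1))
```

Because $w_v(\sigma) \propto \sigma^{1-\gamma}$, a fit to the logarithmic slope of $w_v$ gives $\gamma$ directly and the amplitude then fixes $r_0$.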



5.2 The Methods and Tests of the Methods 

Our method for estimating $\xi(\sigma, \pi)$ is the same as in
Sections M and W, except that we now bin counts in two
variables instead of just one. The estimate of $\xi(\sigma, \pi)$ becomes
noisy at very large scales and so we truncate the integral in
equation 14 at some upper limit, $\pi_{cut}$,

$$w_v(\sigma) = 2 \int_{0}^{\pi_{cut}} \xi(\sigma, \pi)\, d\pi. \qquad (17)$$

In practice we use a $\pi_{cut}$ of 30 $h^{-1}$ Mpc for all our calculations
and our results are insensitive to raising this value. This
integral is carried out using a simple mid-point integration
scheme which is quite adequate given the uncertainties in
$\xi(\sigma, \pi)$.
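
A minimal Python sketch of such a mid-point scheme is given below for a single $\sigma$ bin. It assumes that $\xi(\sigma, \pi)$ has been estimated at the centres of equal-width $\pi$ bins running from 0 to $\pi_{cut}$; the bin width, the function names and the toy integrand are illustrative assumptions and are not taken from the survey analysis.

```python
import math

def pi_bin_centres(pi_cut, n_bins):
    """Centres and width of n_bins equal-width bins spanning 0 to pi_cut."""
    d_pi = pi_cut / n_bins
    return [(k + 0.5) * d_pi for k in range(n_bins)], d_pi

def projected_wv(xi_at_centres, d_pi):
    """Mid-point estimate of 2 * integral_0^pi_cut xi(sigma, pi) dpi
    for one sigma bin, given xi at the pi bin centres."""
    return 2.0 * d_pi * sum(xi_at_centres)

# Toy check with xi = exp(-pi), whose integral from 0 to 30 is very close to 1.
centres, d_pi = pi_bin_centres(30.0, 3000)
toy_wv = projected_wv([math.exp(-p) for p in centres], d_pi)
```

In this toy case `toy_wv` should be very close to 2. With the coarser bins of a real pair count the mid-point rule is less exact, but, as noted above, the error is small compared with the uncertainties in $\xi(\sigma, \pi)$.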

We test these methods by using the CDM $N$-body simulations
and mock catalogues of Section H. Firstly, although
not shown here, we have estimated the real space 2-point
correlation function, $\xi(r)$, directly from the $N$-body
simulations in the same manner as we estimated the actual
redshift space 2-point correlation function for Figs. h] and H.
We find that the SCDM model is approximately fit by a
$r_0 \sim 5.0\,h^{-1}$ Mpc, $\gamma \sim 2.2$ power law out to ~20 $h^{-1}$ Mpc
scales. Similarly, the LCDM model has approximate parameters
of $r_0 \sim 6.0\,h^{-1}$ Mpc, $\gamma \sim 2.2$ out to ~30 $h^{-1}$ Mpc scales.
These values of $r_0$ and $\gamma$ are then used in equation 16 to
predict $w_v(\sigma)$ power laws of $\sim 95.7\,\sigma^{-1.2}$ and $142.9\,\sigma^{-1.2}$ for
the SCDM and LCDM models, respectively. Secondly, we
estimate the redshift space $\xi(\sigma, \pi)$ from each $N$-body
simulation directly and average to obtain the best estimate
possible for each CDM model. Using these two $\xi(\sigma, \pi)$'s we then
estimate $w_v(\sigma)$ from equation 17 for the SCDM and LCDM
models, respectively. Finally, we estimate $\xi(\sigma, \pi)$ from each
mock catalogue using the optimal weighting/estimator
combination of Efstathiou (1988) and Hamilton (1993). Many
estimates of $w_v(\sigma)$ are obtained from equation 17 and then
averaged to produce the mean estimate from the SCDM and
LCDM mock catalogues, respectively.

We plot these three sets of results on Fig. 8(a) for the
SCDM model and Fig. 8(b) for the LCDM model. The dotted
line denotes the power law model for $w_v(\sigma)$ predicted
by equation 16. The solid line denotes the results of
estimating $w_v(\sigma)$ directly from the $N$-body simulations using
equation 17. The solid points are the mean $w_v(\sigma)$ for the
mock catalogues as estimated from equation 17. The error
bars on these points are the 1$\sigma$ standard deviation on an
individual mock catalogue as calculated from the scatter
between the mock catalogues. Looking at Fig. 8(a) we see
that the $w_v(\sigma)$ estimated from the SCDM mock catalogues
(using the optimal weighting/estimator of Section 0) does
accurately reproduce the $w_v(\sigma)$ estimated directly from the
SCDM $N$-body simulations. Also, we see that the power law
predictions of equation 16 give good agreement with the
estimated $w_v(\sigma)$ out to ~20 $h^{-1}$ Mpc scales, which is the scale at
which the power law approximation for the SCDM $\xi(r)$ was
seen to break down in the $N$-body simulations. We can make
similar comments regarding the LCDM results in Fig. 8(b),
namely that the mock catalogues trace the expected $w_v(\sigma)$
and the predicted power law model is a good approximation
out to ~30 $h^{-1}$ Mpc scales, where the LCDM power law $\xi(r)$
breaks down.

To conclude these tests of the methods we state that the
mock catalogues do produce the expected projected 2-point
correlation function from the $N$-body simulations. Also, this
method can self-consistently reproduce the power law form
of the real space 2-point correlation function from $\xi(\sigma, \pi)$
via the projected 2-point correlation function.

Figure 9. Estimates of the projected 2-point correlation
function, $w_v(\sigma)$, from the Durham/UKST Galaxy Redshift Survey
using Hamilton's (1993) estimator. Open symbols denote the
unweighted estimate while solid symbols denote the weighted one.
The solid line shows the best fitting $\xi(r)$ power law model to the
weighted $w_v(\sigma)$ in the indicated range.

Table 2. Comparison of the best fit real space 2-point correlation
function parameters from the Durham/UKST survey with recent
galaxy redshift survey results and also previous Durham ones.



5.3 Results from the Durham/UKST Survey 

Fig. 9 shows the results of applying these methods to the
Durham/UKST Galaxy Redshift Survey. We use Hamilton's
(1993) $DD \cdot RR/DR^2 - 1$ estimator to calculate $\xi(\sigma, \pi)$ but,
for clarity, show the results for both the unweighted estimate
of equation Q (open symbols) and the weighted estimate of
equation H (solid symbols). The solid line shows the best
fitting power law model from equation 16 to the weighted
$w_v(\sigma)$ in the 0.25-10 $h^{-1}$ Mpc range. The error bars shown
are the 1$\sigma$ standard deviation on an individual LCDM mock
catalogue. These error bars use the same weighting/estimator
combination as the data points in question.
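
For reference, the Hamilton (1993) form of the estimator can be sketched in Python for a single ($\sigma$, $\pi$) bin as below. This is an unweighted illustration only: it does not implement the Efstathiou (1988) pair weighting used for the weighted estimate, and the normalisation by the total number of available pairs is an assumed convention. The function and argument names are hypothetical.

```python
def hamilton_xi(dd, dr, rr, n_data, n_random):
    """Hamilton (1993) estimator DD.RR/DR^2 - 1 for one separation bin.

    dd, rr: raw counts of distinct data-data and random-random pairs.
    dr:     raw count of data-random cross pairs.
    Counts are normalised by the total number of available pairs so that
    data and random catalogues of different sizes can be combined
    (an assumed convention).
    """
    dd_n = dd / (n_data * (n_data - 1) / 2.0)
    rr_n = rr / (n_random * (n_random - 1) / 2.0)
    dr_n = dr / (n_data * n_random)
    if dr_n == 0.0:
        return float("nan")  # estimator undefined for an empty cross count
    return dd_n * rr_n / dr_n ** 2 - 1.0
```

If the same fraction of data-data, data-random and random-random pairs falls in a bin, the function returns a value close to zero; doubling the data-data fraction alone gives a value close to one.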

We see that the unweighted estimate is systematically
lower than the weighted one. This is a direct result of the
weighted redshift space 2-point correlation function being
higher than the unweighted one (see Fig. kj). The power law
approximation of equation 16 is fit using a minimum $\chi^2$
statistic and the best fit parameters are presented in Table 2.
This gave a $\chi^2$ of ~8 for 12 degrees of freedom, which is an
adequate fit. Errors on these parameters come from the
appropriate $\Delta\chi^2$ contour about this minimum. However, given
the correlated nature of these points, we anticipate that our
quoted errors are more than likely an underestimate. Again,
this should be adequate for the simple comparison done here.
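
The fitting procedure described above can be illustrated with a simple grid search in Python. This is a sketch under stated assumptions, not the procedure actually used for Table 2: the points are treated as independent, a brute-force grid replaces whatever minimiser was used, and the default $\Delta\chi^2 = 2.30$ is the standard joint 68.3 per cent threshold for two fitted parameters, which may differ from the contour adopted in the text. All function names are hypothetical.

```python
import math

def wv_model(sigma, r0, gamma):
    """Power law w_v(sigma) for xi(r) = (r0 / r)**gamma, gamma > 1."""
    amp = (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
           / math.gamma(gamma / 2.0))
    return sigma * (r0 / sigma) ** gamma * amp

def chi2(r0, gamma, sigmas, wv, err):
    """Chi-squared of the model, treating the points as independent."""
    return sum(((w - wv_model(s, r0, gamma)) / e) ** 2
               for s, w, e in zip(sigmas, wv, err))

def grid_fit(sigmas, wv, err, r0_grid, gamma_grid, delta_chi2=2.30):
    """Best (r0, gamma), minimum chi2, and the parameter ranges spanned
    by grid points with chi2 <= chi2_min + delta_chi2."""
    table = [(chi2(r0, g, sigmas, wv, err), r0, g)
             for r0 in r0_grid for g in gamma_grid]
    chi2_min, r0_best, g_best = min(table)
    inside = [(r0, g) for c, r0, g in table if c <= chi2_min + delta_chi2]
    r0_range = (min(p[0] for p in inside), max(p[0] for p in inside))
    g_range = (min(p[1] for p in inside), max(p[1] for p in inside))
    return (r0_best, g_best), chi2_min, r0_range, g_range
```

Because neighbouring $w_v(\sigma)$ points are correlated, ranges obtained in this way will tend to underestimate the true parameter uncertainties, as noted above.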

Table 2 also gives a comparison of the best fit $\xi(r)$ power
law model parameters to the $w_v(\sigma)$ estimated from some
recent optical galaxy redshift surveys (Loveday et al. 1995;
Lin et al. 1996) and also previous Durham ones (Bean et al.
1983; Hale-Sutton et al. 1989). We see that the best fit real
space correlation lengths, $r_0$, all agree well with a value of
~5.0 $h^{-1}$ Mpc. Also, the slopes, $\gamma$, all agree quite well with
a value of ~1.75, bar the Durham/UKST one which is 1-2$\sigma$
low. (Again the DARS/SAAO survey had $\gamma$ fixed at 1.8 during
the fitting.) We find consistent results when comparing
with the $r_0 \sim 4.5\,h^{-1}$ Mpc and $\gamma \sim 1.7$ obtained by Baugh
(1996) from numerically inverting the APM angular correlation
function, $w(\theta)$.

Survey          r_0 (h^-1 Mpc)   gamma
Durham/UKST     5.1 ± 0.3        1.60 ± 0.10
APM-Stromlo     5.1 ± 0.2        1.71 ± 0.05
Las Campanas    5.0 ± 0.14       1.79 ± 0.04
DARS/SAAO       4.7 ± 0.4        (1.8)

Our conclusion from Table 2 is that a simple, single power
law model gives both an adequate fit and consistent results
across the present data sets.



6 CONCLUSIONS 

We have empirically determined the optimal method of
estimating the 2-point correlation function from a magnitude
limited galaxy redshift survey. Our method used Monte
Carlo techniques on mock catalogues drawn from $N$-body
simulations of cold dark matter structure formation models.
From the currently available choices of 2-point correlation
function estimator and weighting we find that both
the minimum variance and the most accurate reproduction of
the 2-point correlation function are given by the estimator of
Hamilton (1993) and the weighting of Efstathiou (1988).

These techniques are then applied to the Durham/UKST
Galaxy Redshift Survey and the redshift space 2-point
correlation function is calculated for this survey. We
find that our results agree well with those from other recent
redshift surveys and confirm the previously claimed
detections of large scale power in the 10-40 $h^{-1}$ Mpc regime (e.g.
Loveday et al. 1992, 1995). A simple power law model is
an adequate fit to the data (although not particularly
impressive) and has redshift space parameters of correlation
length, $r_0 = 6.8 \pm$ O.S $h^{-1}$ Mpc, and slope, $\gamma = 1.25 \pm 0.06$.
At small scales these results agree with the results from
previous Durham pencil-beam surveys (Shanks et al. 1983,
1989). However, at large ($r > 10\,h^{-1}$ Mpc) scales the older
surveys suggested too little power, mainly due to statistical
fluctuations, with some smaller contribution due to the integral
constraint.

We compare our results with the predictions of two common
models of structure formation, namely the standard
cold dark matter model, $\Omega h = 0.5$ and $b = 1.6$ (SCDM),
and a low density cold dark matter model with a non-zero
cosmological constant to ensure spatial flatness, $\Omega h = 0.2$,
$\Lambda = 0.8$ and $b = 1.0$ (LCDM). Our results agree well with
both of these models on small scales ($< 10\,h^{-1}$ Mpc) but on
larger scales we find our results are $> 3\sigma$ above and beyond
the SCDM model in the 10-40 $h^{-1}$ Mpc region. The LCDM
model is more consistent with our results but is still 1-2$\sigma$
low.

Given that our survey uses redshifts as distance estimates,
our measured clustering statistics are distorted by
the peculiar velocity field. Using standard techniques (e.g.
Peebles 1980) we calculate the projected 2-point correlation
function and use it to model the real space 2-point
correlation function. We find that a simple power law model
provides an adequate fit to the projected 2-point correlation
function from the Durham/UKST survey, which implies
real space parameters of correlation length,
$r_0 = 5.1 \pm 0.3\,h^{-1}$ Mpc, and slope, $\gamma = 1.6 \pm 0.1$. The differences
seen in the real and redshift space parameters are discussed
in the next paper in this series.